(54) Methods and apparatus for reducing interference in a branch history table of a
microprocessor 



(57) Interference in a branch history table (214) of a 
microprocessor is reduced by methods (300, 400) and 
apparatus (200) which predict the outcome of branch 
instructions (taken or not taken) through a combination 
of static and dynamic prediction techniques. Static pre- 
diction information (e.g., a compiler hint) may be stored 
in instruction memory (204), and dynamic prediction 
information is stored in a branch history table (214). A 
branch prediction (302, 406) results from an exclusive 
OR (216) of static (220) and dynamic (226) prediction 
information. After execution of a branch instruction, an 
indication (222) as to whether a branch was taken or not 
taken is exclusively ORed (212) with the static predic- 
tion information (220) for the branch instruction, and the 
result (218) of this exclusive OR (212) is used to update 
(304, 408) an appropriate entry in the branch history 
table (214). Using the methods (300, 400) and appara- 
tus (200) disclosed herein, two well-behaved branches 
may share an entry in a branch history table (214), yet 
not interfere with one another (even when the two well- 
behaved branches include one which is mostly taken, 
and one which is mostly not taken). 
Description

Field of the Invention 



[0001] The invention pertains to the maintenance and
use of a branch history table in a microprocessor.

Background of the Invention

[0002] Most modern computers, including those that
execute instructions out-of-order and/or via a pipelined
execution unit, execute instructions "speculatively".
That is, instructions are executed before the instructions
on which they depend have been fully executed, and
quite possibly, before the outcomes of branches in the
instruction stream are known. To achieve a high degree
of performance, the microprocessors in these computers
employ a variety of techniques to minimize the cost
of erroneously predicted branches in the instruction
stream. These techniques usually involve some form of
"branch prediction". Branch prediction is a means of
optimizing for the outcome of a branch instruction which
is most likely to occur (either "taken" or "not taken").

[0003] Typically, a branch prediction will be based on
one of two types of information: 1) static prediction
information, or 2) dynamic prediction information. Static
prediction information is generated prior to the execution of
a computer program, and may be based on factors such
as instruction type, position in the instruction stream,
instruction repetition, and so on. Dynamic prediction
information is generated during the execution of a computer
program, and usually depends on a history of previous
outcomes of a given branch and/or other branch
instructions.

[0004] Dynamic prediction information is stored in a
branch history table comprising a number of entries. If a
branch history table were large enough, it is conceivable
that a distinct history could be maintained for each
branch instruction of a computer program. However,
given that microprocessor chip area is a costly
resource, and that branch history tables are often
scaled back to make room for other important
microprocessor elements, entries in a branch history table
are often shared. Interference between conflicting
branch histories is therefore a significant problem.

[0005] When conflicting histories share a single entry
in a branch history table, the history for any given
branch instruction is often corrupted by other branch
instructions, thereby resulting in a mispredicted branch
outcome. When a branch outcome is mispredicted, serious
and costly consequences result. For example,
instruction pipelines may stall, instruction execution
units may be halted, caches and registers may need to
be flushed, and so on. All of these consequences result
in unacceptable delays.

[0006] It is therefore a primary object of this invention
to provide methods and apparatus which reduce interference
in a branch history table of a microprocessor,
thereby yielding 1) more accurate branch predictions,
and consequently 2) fewer delays caused by erroneously
predicted branches.



Summary of the Invention

[0007] To understand the invention, it must first be
recognized that the vast majority of branches are either
"almost always taken" or "almost always not taken".
These branches may be referred to as "well-behaved"
branches. One must also recognize that when the outcome
of a branch switches, it often switches from
"almost always taken" to "almost always not taken", or
vice versa. It is also important to note that branch
prediction schemes typically rely on the assumption that
most branches are well-behaved. As a result, the goal of
both static and many dynamic branch prediction
schemes is to predict what the dominant outcome of a
branch will be.

[0008] Recognizing the above facts, one can appreciate
that the prediction accuracy of a well-behaved
branch of one type (e.g., an "almost always taken"
branch) is degraded when the branch shares a branch
history table entry with a well-behaved branch of the
other type (e.g., an "almost always not taken" branch).
[0009] The branch prediction schemes of the Hewlett-Packard
Company PA-8x00 family of microprocessors
(e.g., the PA-8000, PA-8200, and PA-8500) presume the
above facts on well-behaved branches to be true.
Hewlett-Packard Company is based in Palo Alto, California,
USA, and the PA-8x00 family of microprocessors
is described in more detail in Advanced Performance
Features of the 64-bit PA-8000 by D. Hunt (March 5,
1995), HP Pumps Up PA-8x00 Family: PA-8200 in
2Q97, PA-8500 in 2Q98 Aim to Grab Performance Lead
by L. Gwennap (October 28, 1996), and PA-8500: The
Continuing Evolution of the PA-8000 Family by G.
Lesartre and D. Hunt (Feb. 23, 1997). These papers are
hereby incorporated by reference for all that they
disclose.

[0010] Compilers which generate code for the PA-8x00
family of microprocessors are capable of encoding
a "hint" in most branch instructions. These hints are a
form of static prediction information, and are indications
as to whether the compiler believes a given branch will
be mostly taken or mostly not taken. The compiler for
the PA-8000 microprocessor is described in more detail
in Compiler Optimizations for the PA-8000 by A. Holler.
This paper is hereby incorporated by reference for all
that it discloses.

[0011] In the achievement of the foregoing objects, the
inventor has devised methods and apparatus which utilize
these compiler-generated hints (or any other static
prediction information) to ensure that two or more
well-behaved branches sharing a single entry in a branch
history table do not corrupt the history information
stored therein. After execution of a branch instruction,
an indication as to whether a branch instruction resulted
in a branch being taken or not taken is exclusively ORed
with the compiler-generated hint for the branch instruction,
and the result of this exclusive OR is used to
update an appropriate entry in the branch history table.
Furthermore, the outcome of a branch instruction is
predicted in response to the exclusive OR of 1) the
compiler-generated hint, and 2) dynamic prediction
information read from an appropriate entry of the
branch history table.

[0012] Using the methods and apparatus disclosed
herein, two well-behaved branches may share an entry
in the branch history table, yet not corrupt the history
information stored therein (even when the two
well-behaved branches comprise one which is mostly taken,
and one which is mostly not taken).

[0013] These and other important advantages and
objectives of the present invention will be further
explained in, or will become apparent from, the
accompanying description, drawings and claims.

Brief Description of the Drawings 

[0014] An illustrative and presently preferred embodiment
of the invention is illustrated in the drawings in
which:

FIG. 1 illustrates a first embodiment of branch
prediction hardware;

FIG. 2 illustrates a second embodiment of branch
prediction hardware;

FIG. 3 illustrates a method of predicting outcomes
of a plurality of branch instructions executed in a
microprocessor; and

FIG. 4 illustrates a method of reducing interference
in a branch history table of a microprocessor.

Description of the Preferred Embodiment 

[0015] Apparatus 200 in a microprocessor for predicting
whether branches identified in a plurality of branch
instructions will be taken or not taken is illustrated in
FIG. 2, and may generally comprise a branch history
table 214, one or more data storage locations 204 for
storing static prediction information corresponding to a
plurality of branch instructions, and first 216 and second
212 logic gates. The branch history table 214 comprises
a plurality of entries. The first logic gate 216 comprises
an input for receiving static prediction information 220
derived from an addressed one of the one or more data
storage locations 204, an input for receiving information
226 derived from at least one entry in the branch history
table 214, and a branch prediction output 224 which is
indicative of whether one of the plurality of branch
instructions will be taken or not taken. The second logic
gate 212 comprises an input for receiving static prediction
information 220 derived from an addressed one of
the one or more data storage locations 204, an input for
receiving information 222 which is indicative of whether
a branch identified in a branch instruction was taken or
not taken, and a branch history update output 218 which
is indicative of whether the static prediction information
corresponding to a branch instruction was correct. The
branch history update output 218 is received by the
branch history table 214.

[0016] A method 300 of reducing interference in a
branch history table 214 of a microprocessor (which
might utilize the above described apparatus 200) is
illustrated in FIG. 3, and may generally comprise predicting
302 outcomes of a plurality of branch instructions in a
computer program, and updating 304 an entry in a
branch history table 214 after execution of a given
branch instruction. The outcomes of branch instructions
are predicted 302 at least partly in response to hints
encoded in the branch instructions, and entries in the
branch history table 214. The branch history table 214 is
updated 304 at least partly in response to whether the
hint encoded in the given branch instruction was correct.

[0017] A method 400 of predicting outcomes of a plurality
of branch instructions executed in a microprocessor
(which also might utilize the above
described apparatus 200) is illustrated in FIG. 4, and
may generally comprise 1) maintaining 402 a branch
history table 214 comprising a plurality of entries, 2)
maintaining 404 static prediction information for a plurality
of branch instructions, 3) predicting 406 outcomes of
the plurality of branch instructions, and 4) updating 408
an entry in the branch history table 214 after execution
of each of the plurality of branch instructions. Outcomes
of branch instructions are predicted 406 at least partly in
response to 1) the static prediction information 220, and
2) an entry in the branch history table 214. The branch
history table 214 is updated at least partly in response
to whether the static prediction information 220 was
correct.

[0018] Having described the above methods 300, 400
and apparatus 200 in general, the methods 300, 400
and apparatus 200 will now be described with more
particularity.

[0019] A high-level schematic of branch prediction
hardware 100 existing in Hewlett-Packard Company's
PA-8000 and PA-8200 microprocessors, which serves
as a basis for implementing the preferred embodiments
of the methods 300, 400 and apparatus 200 disclosed
herein, is illustrated in FIG. 1. In general, the branch
prediction hardware 100 of these computers comprises
an instruction fetch unit 102, an instruction memory
104, an instruction execution unit 106, and a branch
history table 110.

[0020] The instruction fetch unit 102 generates
addresses of instructions to be executed in response to
inputs from the instruction memory 104, the branch
history table 110, and the instruction execution unit 106.
The first of these inputs (i.e., the one received from
instruction memory 104) allows the instruction fetch unit
102 to determine if a previously addressed instruction
was a branch instruction. If so, the second of the inputs
(i.e., the one received from the branch history table 110)
provides dynamic prediction information which allows
the instruction fetch unit 102 to determine whether a
branch identified by the branch instruction is mostly
taken or mostly not taken. The third of the inputs (i.e.,
the one received from the instruction execution unit 106)
provides an indication as to whether a branch was taken
or not taken. If the instruction fetch unit 102 determines
that it erroneously predicted the outcome of a branch
instruction, then steps must be taken to clear the
instruction pipeline, and otherwise recover from the
erroneous prediction.

[0021] Addresses generated by the instruction fetch
unit 102 are provided to both the instruction memory
104 and the branch history table 110. To provide for
pipelined instruction execution, hold 108 is several
entries deep, and allows an appropriate entry in the
branch history table 110 to be addressed subsequent to
a branch instruction's processing through a pipeline in
instruction execution unit 106.

[0022] The same indication as to whether a branch 
was taken or not taken is also provided to the branch 
history table 110, and serves to increment, decrement, 
or add to the data stored in an entry of the branch his- 
tory table 110 addressed by hold 108. 
[0023] In the PA-8000 microprocessor, each entry in
the branch history table 110 is maintained by a 3-bit
shift register which stores a taken/not taken history of
one or more branches. If a branch is taken, a logic "1" is
moved into an appropriate shift register. If a branch is
not taken, a logic "0" is moved into an appropriate shift
register. A branch prediction generated by the branch
history table 110 is a logic "1" (meaning mostly taken) if
any two of an addressed shift register's history bits hold
a logic "1". Otherwise, the branch prediction is a logic
"0" (meaning mostly not taken).

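The shift-register scheme of paragraph [0023] can be sketched in a few lines of Python. This is an illustrative model only, not the PA-8000 circuit: the function names are hypothetical, the shift direction is an assumption, and "any two" history bits is read here as "at least two of the three".

```python
def shift_in(history: int, taken: bool) -> int:
    """Shift the newest outcome into a 3-bit history, dropping the oldest bit."""
    return ((history << 1) | int(taken)) & 0b111


def predict(history: int) -> int:
    """Return 1 (mostly taken) when at least two history bits are set, else 0."""
    return 1 if bin(history & 0b111).count("1") >= 2 else 0


# Record the outcomes taken, taken, not taken, starting from an empty history.
history = 0b000
for taken in (True, True, False):
    history = shift_in(history, taken)

print(f"history={history:03b} prediction={predict(history)}")
```
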
[0024] In the PA-8200 microprocessor, each entry in
the branch history table 110 is maintained by a 2-bit
saturating up/down counter. If a branch is taken, an
appropriate counter is incremented. If a branch is not taken,
an appropriate counter is decremented. Of course,
when a counter has reached its maximum count (i.e.,
saturation), additional increment attempts will have no
effect on the counter. Likewise, when a counter has
reached its minimum count (i.e., saturation), additional
decrement attempts will have no effect on the counter. A
branch prediction generated by the branch history table
110 is equal to the most significant bit (MSB) of a
counter.
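
A 2-bit saturating up/down counter of the kind described above might be modelled as follows. This is a minimal sketch rather than the PA-8200 implementation; the names `update` and `predict` and the two constants are assumptions made for illustration.

```python
COUNTER_MIN = 0b00  # lower saturation point of a 2-bit counter
COUNTER_MAX = 0b11  # upper saturation point of a 2-bit counter


def update(counter: int, taken: bool) -> int:
    """Increment on a taken branch, decrement otherwise, clamping at the limits."""
    if taken:
        return min(counter + 1, COUNTER_MAX)
    return max(counter - 1, COUNTER_MIN)


def predict(counter: int) -> int:
    """The prediction is the most significant bit of the counter (1 = taken)."""
    return (counter >> 1) & 1


# Saturation: further increments at 11 and decrements at 00 have no effect.
print(update(COUNTER_MAX, True), update(COUNTER_MIN, False))
```
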

[0025] A problem with the branch history tables 110 of
both the PA-8000 and PA-8200 microprocessors is that
when a single entry is shared by more than one branch,
and one of the branches is mostly taken, while another
is mostly not taken, conflicts result and the outcome of a
branch instruction can be predicted incorrectly.
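
The conflict can be made concrete with a small simulation of one shared 3-bit shift-register entry. The sketch below rests on stated assumptions: the two branches execute in strict alternation, the entry starts at 000, and the helper names are hypothetical.

```python
def shift_in(history: int, taken: bool) -> int:
    """Shift the newest outcome into a 3-bit history."""
    return ((history << 1) | int(taken)) & 0b111


def predict(history: int) -> int:
    """Predict taken (1) when at least two of the three history bits are set."""
    return 1 if bin(history).count("1") >= 2 else 0


def count_mispredictions(outcomes, history: int = 0b000) -> int:
    """Predict, compare with the real outcome, then record the outcome."""
    wrong = 0
    for taken in outcomes:
        if predict(history) != int(taken):
            wrong += 1
        history = shift_in(history, taken)
    return wrong


always_taken_alone = [True] * 16        # one always-taken branch owns the entry
shared_alternating = [True, False] * 8  # it alternates with a never-taken branch

print(count_mispredictions(always_taken_alone))
print(count_mispredictions(shared_alternating))
```

In this model the unshared branch mispredicts only while the history warms up, whereas the shared entry settles into a cycle in which almost every prediction is wrong.
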
[0026] Referring now to FIG. 2, which illustrates a
preferred embodiment of the invention, one will note the
appearance of instruction fetch unit 202, instruction
memory 204, instruction execution unit 206, hold 208,
and branch history table 214. These components operate
similarly to those illustrated in FIG. 1, and may, in
fact, be identical to those illustrated in FIG. 1.

[0027] As in FIG. 1, one output of instruction fetch unit
202 is an instruction address. This address is provided
to the instruction memory 204 for retrieval of an instruction
stored therein, and is further provided to the branch
history table 214 for retrieval of historical prediction
information relating to an addressed branch instruction.
[0028] Hold device 208 may comprise one or more
registers which latch the instruction addresses generated
by instruction fetch unit 202. Since branch prediction
is typically only necessary in out-of-order and/or
pipelined computer systems, hold device 208 will most
likely comprise a plurality of registers which maintain
the addresses of recently fetched instructions (most
likely just branch instructions). In this manner, an
appropriate entry in the branch history table 214 may be
addressed and updated several cycles after a branch
instruction is addressed (e.g., after the branch instruction
has advanced through a pipeline of instruction execution
unit 206). Means may be provided for clearing or
advancing entries in hold device 208 upon execution or
retirement of an instruction.
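
One way to picture the hold device is as a small first-in, first-out queue of instruction addresses. The sketch below is an assumption-laden model, not the disclosed circuit: the class name `HoldDevice`, its methods, the fixed depth, and in-order advancement are all hypothetical choices.

```python
from collections import deque


class HoldDevice:
    """FIFO of instruction addresses awaiting a branch history table update."""

    def __init__(self, depth: int) -> None:
        self.depth = depth          # number of address registers (assumed fixed)
        self._addresses = deque()

    def latch(self, address: int) -> None:
        """Latch the address of a newly fetched branch instruction."""
        if len(self._addresses) >= self.depth:
            raise OverflowError("hold device is full")
        self._addresses.append(address)

    def advance(self) -> int:
        """Return and remove the oldest address, e.g. when its branch retires."""
        return self._addresses.popleft()


hold = HoldDevice(depth=4)
hold.latch(0x1000)
hold.latch(0x1008)
oldest = hold.advance()  # address whose table entry is updated first
```
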

[0029] The instruction memory 204 may be a cache
which is internal or external to a microprocessor, or in
the alternative, may be part of a main memory. It is also
possible that instruction memory 204 may comprise a
combination of caches and main memory.
[0030] The instruction execution unit 206 may be any
one or more of an integer arithmetic logic unit (integer
ALU), a floating-point multiply accumulate unit (FMAC),
a shift/merge unit, a divide/square-root unit
(divide/SQRT), an instruction reorder buffer (IRB), or
other execution unit. The instruction execution unit 206
produces a signal 222 which is indicative of whether a
branch identified in a branch instruction was taken or
not taken.

[0031] In the preferred implementation, static prediction
information is saved 404 (FIG. 4) in one or more
data storage locations of instruction memory 204, and is
saved as a predecode bit stored in conjunction with various
of the instructions saved in the instruction memory
204. The decode unit 210 therefore reads the appropriate
predecode bit 220, and routes same to logic gates
212 and 216. A hold device 228, possibly similar to hold
device 208, allows static information to be provided to
logic gate 212 subsequent to a branch instruction's
processing through a pipeline in instruction execution
unit 106.

[0032] Logic gate 212 provides a branch history
update signal 218 to branch history table 214. Inputs to
the logic gate 212 comprise static prediction information
220 from decode unit 210, and the signal 222 from
instruction execution unit 206 which is indicative of
whether a branch identified in a branch instruction was
taken or not taken. In a preferred embodiment, logic
gate 212 is a single exclusive OR gate (XOR gate). If,
for example, static prediction information 220 comprises
a plurality of bits, or prediction history information
maintained in the branch history table 214 depends on
factors other than static prediction information 220 and the
success thereof, logic gate 212 might comprise a more
complex XOR gate, or even an alternate form of logic
gate.
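
For the single-bit case, the behaviour of logic gate 212 reduces to one XOR. The following is a minimal sketch, assuming one-bit signals; the function name `history_update` is hypothetical, and the comments use the bit encoding given later in paragraph [0040].

```python
def history_update(static_hint: int, taken: int) -> int:
    """Model of XOR gate 212: combine hint 220 with outcome 222 into update 218."""
    return (static_hint ^ taken) & 1


# Encoding from the description: hint 0 = mostly taken, hint 1 = mostly not
# taken; outcome 1 = taken, outcome 0 = not taken. Under that encoding the
# update signal is 1 exactly when the hint matched the outcome.
truth_table = {
    (hint, taken): history_update(hint, taken)
    for hint in (0, 1)
    for taken in (0, 1)
}

for (hint, taken), signal in sorted(truth_table.items()):
    print(f"hint={hint} outcome={taken} -> update={signal}")
```
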

[0033] Branch history table 214 comprises a plurality 
of entries. Each entry may be maintained 402 (FIG. 4) in 
a number of ways. For example, an entry may be main- 
tained by a saturating up/down counter, a shift register, 
or other means of latching data. In a preferred embodi- 
ment, each entry of the branch history table is main- 
tained by either a 2-bit up/down saturating counter (as 
in the HP-8200), or a 3-bit shift register (as in the HP- 
8000). 

[0034] Assuming a branch history table 214 of 2-bit 
counters, the branch history update signal 218 (or just 
"Update Signal" in the following table) advances a coun- 
ter from its current state to a next state as illustrated in 
the following table: 



Current State    Update Signal    Next State
00               0                00
00               1                01
01               0                00
01               1                10
10               0                01
10               1                11
11               0                10
11               1                11



[0035] When a counter has reached its maximum 
count, additional increment signals (logic "1"s in the 
above example) have no effect on the counter (i.e., the 
counter is saturated, and will no longer increment). Like- 
wise, when a counter has reached its minimum count, 
additional decrement signals (logic "0"s in the above 
example) have no effect on the counter. 
[0036] If the branch history table 214 comprises a 
number of 2-bit counters, the information 226 output to 
logic gate 216 might comprise only the most significant 
bit (MSB) of an addressed counter. If, on the other hand, 
entries in the branch history table 214 are maintained by 
3-bit shift registers, the information 226 output to logic 
gate 216 might comprise the MSB of a sum of a shift 
register's bits. 
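
The two ways of deriving information 226 can be compared directly. In this sketch the function names are hypothetical, and the "sum" of a 3-bit shift register is taken to be the count of set bits, which fits in two bits.

```python
def info_from_counter(counter: int) -> int:
    """MSB of a 2-bit saturating counter."""
    return (counter >> 1) & 1


def info_from_shift_register(history: int) -> int:
    """MSB of the 2-bit sum of a 3-bit shift register's bits."""
    total = bin(history & 0b111).count("1")  # ranges over 0..3
    return (total >> 1) & 1


# The MSB of the sum is set exactly when two or more history bits are set.
majority = {history: info_from_shift_register(history) for history in range(8)}
print(majority)
```

Under this reading, taking the MSB of the sum is equivalent to a majority vote over the three history bits.
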

[0037] Control logic for reading and/or writing a
branch history table 214 is known in the art, and is
beyond the scope of this disclosure. The branch history
table 214 illustrated in FIG. 2 is presumed to include
such control logic.

[0038] Logic gate 216 provides a branch prediction 
output 224, which is indicative of whether a branch
instruction will be taken or not taken, to instruction fetch
unit 202. The branch prediction output 224 comprises
dynamic prediction information. The outcome of a 
branch instruction addressed by instruction fetch unit 
202 is therefore predicted in response to both static 220 
and dynamic 226 prediction information (unless a com- 
puter program which is being executed was not com- 
piled with static prediction information - in this case, the 
outcome of a branch instruction addressed by instruc- 
tion fetch unit 202 will only be predicted in response to 
dynamic prediction information 226). Inputs to the logic 
gate 216 comprise static prediction information 220 
from decode unit 210, and information 226 derived from 
at least one entry in the branch history table 214. In a 
preferred embodiment, logic gate 216 is a single XOR 
gate. But again, for example, if static prediction informa- 
tion 220 comprises a plurality of bits, or if prediction of a 
branch instruction's outcome depends on factors other 
than static prediction information 220 and information 
226 maintained in the branch history table 214, logic 
gate 216 might comprise a more complex XOR gate, or 
even an alternate form of logic gate. 
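
In the single-bit case, logic gate 216 can be read as "follow the hint when the history says the hint has been right, otherwise predict the opposite". The sketch below checks that reading; the function names are hypothetical, and the encoding (hint 0 = mostly taken, prediction 1 = taken) is the one given later in paragraph [0040].

```python
def branch_prediction(static_hint: int, history_info: int) -> int:
    """Model of XOR gate 216: combine hint 220 with information 226."""
    return (static_hint ^ history_info) & 1


def hinted_direction(static_hint: int) -> int:
    """Direction implied by the hint alone (1 = taken), given hint 0 = mostly taken."""
    return static_hint ^ 1


for hint in (0, 1):
    # history_info 1: follow the hint; history_info 0: predict its opposite.
    assert branch_prediction(hint, 1) == hinted_direction(hint)
    assert branch_prediction(hint, 0) == hinted_direction(hint) ^ 1
```
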
[0039] If it is desirable that a computer system be able 
to execute computer programs which do not comprise 
static prediction information, then instruction fetch unit 
202 might comprise, or be responsive to, a branch pre- 
diction mode indicator (not shown) which signals 
whether a computer program does or does not com- 
prise static prediction information. If a computer pro- 
gram does not comprise static prediction information, 
then signal 220 can be driven to a logic "0" so that the
outputs 218, 224 of logic gates 212 and 216 are solely 
dependent on the taken/not taken information 222 pro- 
vided by instruction execution unit 206. Note that in this 
mode, entries in the branch history table 214 which are 
shared by more than one branch are subject to update 
by branches having conflicting outcomes, and the prob- 
ability of erroneous branch prediction is increased. 
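
Driving signal 220 to logic "0" makes both XOR gates transparent, so the hardware falls back to a conventional history table. A short sketch of that observation follows, assuming single-bit gates and hypothetical function names.

```python
def history_update(static_hint: int, taken: int) -> int:
    """Single-bit model of gate 212."""
    return static_hint ^ taken


def branch_prediction(static_hint: int, history_info: int) -> int:
    """Single-bit model of gate 216."""
    return static_hint ^ history_info


NO_STATIC_INFO = 0  # signal 220 driven to logic "0"

# With the hint forced to 0, the update equals the raw outcome and the
# prediction equals the raw history bit.
updates = [history_update(NO_STATIC_INFO, taken) for taken in (0, 1)]
predictions = [branch_prediction(NO_STATIC_INFO, info) for info in (0, 1)]
print(updates, predictions)
```
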
[0040] Assuming that the branch history table 214
comprises a number of 2-bit saturating up/down
counters, a logic "0" hint indicates that a branch is
mostly taken, a logic "1" hint indicates that a branch is
mostly not taken, a logic "1" signal indicates that a
branch identified in a branch instruction was actually
taken, and a logic "0" signal indicates that a branch was
actually not taken, the apparatus 200 shown in FIG. 2
operates as follows. When a first well-behaved branch is
hinted as taken (hint 0), and is actually taken (outcome
1), an appropriate counter in the branch history table
214 is incremented (0 XOR 1 = 1). Likewise, when a
second branch is hinted as not taken (hint 1), and is
actually not taken (outcome 0), an appropriate counter
in the branch history table 214 is incremented (1 XOR 0
= 1). If the first and second branches share the same
entry (e.g., counter) in the branch history table, the
histories for these branches will not interfere with one
another. For example, assume that a counter holds an
initial value of "00", and is then incremented once after
the first branch is taken as hinted, and once after the
second branch is not taken as hinted. After two
increments, the counter holds a value of "10". If the first
branch is again hinted as taken (hint 0), and the MSB of
the counter is read as "1", logic gate 216 produces a
branch prediction of "1" (0 XOR 1 = 1), which is
interpreted by the instruction fetch unit 202 to mean taken.
Alternatively, if the second branch is again hinted as not
taken (hint 1), and the MSB of the counter is read as "1",
logic gate 216 produces a branch prediction of "0" (1
XOR 1 = 0), which is interpreted by the instruction fetch
unit 202 to mean not taken.
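
The worked example in paragraph [0040] can be replayed step by step. The class below is a sketch under the stated encoding, with a hypothetical name and interface; it models one shared 2-bit counter together with the two XOR gates.

```python
class SharedEntry:
    """One 2-bit saturating counter shared by several branches."""

    def __init__(self, counter: int = 0b00) -> None:
        self.counter = counter

    def predict(self, hint: int) -> int:
        """Gate 216: hint XOR counter MSB; 1 means taken."""
        return hint ^ ((self.counter >> 1) & 1)

    def update(self, hint: int, taken: int) -> None:
        """Gate 212: increment when hint XOR outcome is 1, else decrement."""
        if hint ^ taken:
            self.counter = min(self.counter + 1, 0b11)
        else:
            self.counter = max(self.counter - 1, 0b00)


entry = SharedEntry()
entry.update(hint=0, taken=1)  # first branch: hinted taken, actually taken
entry.update(hint=1, taken=0)  # second branch: hinted not taken, not taken

first_prediction = entry.predict(hint=0)   # prediction for the first branch
second_prediction = entry.predict(hint=1)  # prediction for the second branch
print(f"{entry.counter:02b}", first_prediction, second_prediction)
```
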

[0041] Of course, if the hints were always correct, then
branch prediction could be based solely on the hints,
and branch prediction hardware 200 would be unnecessary.
However, assume now that the first and second
branches were hinted correctly during their first execution,
and that their shared counter in the branch history
table stands at "10". If the first branch now switches its
behavior, and becomes mostly not taken, its incorrect
hint (hint 0) is exclusively ORed with its outcome of not
taken (outcome 0), and the counter is decremented (0
XOR 0 = 0). With the counter now standing at "01", a
subsequent prediction of the first branch, which has
now become mostly not taken, would be correct (the
MSB of the counter is now "0", and when exclusively
ORed with a hint of "0" produces a logic "0" which
results in a prediction of not taken - in spite of the hint).
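
The behaviour switch described in paragraph [0041] can be traced the same way. This sketch uses hypothetical helper names and starts from the counter value "10" assumed in the text.

```python
def step(counter: int, hint: int, taken: int) -> int:
    """Apply one update: increment if hint XOR outcome is 1, else decrement."""
    return min(counter + 1, 3) if hint ^ taken else max(counter - 1, 0)


def predict(counter: int, hint: int) -> int:
    """Prediction is the hint XORed with the counter MSB; 1 means taken."""
    return hint ^ ((counter >> 1) & 1)


counter = 0b10                            # shared counter after the first executions
counter = step(counter, hint=0, taken=0)  # first branch: hinted taken, not taken

first_branch = predict(counter, hint=0)   # first branch, still hinted taken
second_branch = predict(counter, hint=1)  # branch sharing the entry, hinted not taken
print(f"{counter:02b}", first_branch, second_branch)
```

In this trace the first branch is now predicted not taken despite its hint, while the same counter value yields a taken prediction for the second branch (hint 1). That is the residual case, one branch hinted correctly and one incorrectly, which paragraph [0042] identifies as the only remaining source of interference.
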
[0042] To summarize, the primary advantage of the 
methods 300, 400 and apparatus 200 disclosed herein 
is that when two well-behaved branches which share an 
entry in a branch history table 214 are hinted correctly, 
dynamic prediction histories for the two branches will 
not interfere with one another - even when one of the 
branches is mostly taken, and the other is mostly not 
taken. Also, when a branch is not hinted correctly, but 1) 
does not share an entry in the branch history table 214 
with any other branch, or 2) is executed 302, 406 (FIGS. 
3, 4) repeatedly (or significantly more often than the 
branch which shares its entry in the branch history table 
214), then the dynamic information stored in the branch 
history table 214 allows the instruction fetch unit 202 to 
make a correct prediction of a branch outcome in lieu of 
the incorrect hint. The only time well-behaved branches 
will interfere with each other is when two branches 
share an entry in the branch history table 214, and one 
is hinted correctly while the other is hinted incorrectly. 
This is in contrast to previous branch prediction 
schemes, wherein two well-behaved branches, one of 
which is mostly taken and one of which is mostly not 
taken, will always interfere with one another if they 
share an entry in the branch history table 214. 
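
The three cases summarized above can be compared with a small simulation. It is a sketch under stated assumptions: a single 2-bit counter starting at 00, two branches executing round-robin, branches that never deviate from their dominant outcome, and hypothetical names throughout.

```python
def simulate(branches, rounds: int, use_hints: bool = True) -> int:
    """Count mispredictions for branches sharing one 2-bit counter.

    branches: list of (hint, taken) pairs, executed round-robin.
    Encoding: hint 0 = mostly taken, hint 1 = mostly not taken, taken 1 = taken.
    With use_hints=False the hint is forced to 0 (conventional scheme).
    """
    counter = 0
    wrong = 0
    for _ in range(rounds):
        for hint, taken in branches:
            h = hint if use_hints else 0
            prediction = h ^ ((counter >> 1) & 1)
            if prediction != taken:
                wrong += 1
            if h ^ taken:
                counter = min(counter + 1, 3)
            else:
                counter = max(counter - 1, 0)
    return wrong


both_hinted_correctly = [(0, 1), (1, 0)]  # always taken; never taken
one_hinted_wrongly = [(0, 1), (0, 0)]     # second branch hinted taken, never taken

print(simulate(both_hinted_correctly, rounds=10))
print(simulate(both_hinted_correctly, rounds=10, use_hints=False))
print(simulate(one_hinted_wrongly, rounds=10))
```

In this model, correctly hinted branches stop mispredicting once the counter warms up, while the conventional scheme and the mis-hinted case mispredict once per round.
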
[0043] While illustrative and presently preferred
embodiments of the invention have been described in
detail herein, it is to be understood that the inventive
concepts may be otherwise variously embodied and
employed, and that the appended claims are intended
to be construed to include such variations, except as
limited by the prior art.

Claims 

1. A method (400) of predicting outcomes of a plurality
of branch instructions executed in a microprocessor,
comprising:

a) maintaining (402) a branch history table
(214) comprising a plurality of entries;

b) maintaining (404) static prediction information
for a plurality of branch instructions;

c) predicting (406) outcomes of the plurality of
branch instructions, each outcome being predicted
at least partly in response to:

i) the static prediction information (220);
and

ii) an entry in the branch history table;

d) after executing each of the plurality of branch
instructions, updating (408) an entry in the
branch history table at least partly in response
to whether the static prediction information was
correct.

2. A method (400) as in claim 1, further comprising:

a) maintaining each entry in the branch history
table by means of a saturating up/down counter;
and

b) after executing a given one of the plurality of
branch instructions,

i) incrementing a predetermined saturating
up/down counter in the branch history
table (214) if the static prediction information
(220) for the given one of the plurality
of branch instructions was correct; and

ii) decrementing a predetermined saturating
up/down counter in the branch history
table if the static prediction information for
the given one of the plurality of branch
instructions was incorrect.

3. A method (400) as in claim 1, further comprising:

a) maintaining each entry in the branch history
table (214) by means of a shift register; and

b) after executing a given one of the plurality of
branch instructions, shifting into a predetermined
shift register of the branch history table
an indication (222) as to whether the static
prediction information for the given one of the
plurality of branch instructions was correct.

4. A method (400) as in claim 1, wherein predicting
(406) the outcome of a given one of the plurality of
branch instructions comprises predicting the outcome
of the given one of the plurality of branch
instructions at least partly in response to the
exclusive OR (216) of:

a) static prediction information (220) corresponding
to the given one of the plurality of
branch instructions; and

b) an entry in the branch history table (214).

5. A method (400) as in claim 4, wherein updating
(408) an entry in the branch history table (214)
comprises updating an entry in the branch history
table at least partly in response to the exclusive OR
(212) of:

a) static prediction information (220) corresponding
to a given one of the plurality of
branch instructions; and

b) an indication (222) as to whether execution
of the given one of the plurality of branch
instructions resulted in a branch being taken or
not taken.

6. A method (400) as in claim 1, wherein updating
(408) an entry in the branch history table (214)
comprises updating an entry in the branch history
table at least partly in response to the exclusive OR
(212) of:

a) static prediction information (220) corresponding
to a given one of the plurality of
branch instructions; and

b) an indication (222) as to whether execution
of the given one of the plurality of branch
instructions resulted in a branch being taken or
not taken.

7. A method (300) of reducing interference in a branch
history table (214) of a microprocessor, comprising:

a) predicting (302) outcomes of a plurality of
branch instructions in a computer program, at
least partly in response to:

i) hints encoded in the branch instructions;
and

ii) entries in a branch history table; and

b) after execution of a given branch instruction,
updating (304) an entry in the branch history
table at least partly in response to whether the
hint encoded in the given branch instruction
was correct.
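Steps a) and b) together can be modelled as a small predict-then-update loop. The following is a toy sketch, not the claimed implementation: it assumes one-bit table entries, combines hint and entry by exclusive OR, and indexes the table by address modulo table size. `HintedPredictor` and `count_mispredictions` are hypothetical names.

```python
class HintedPredictor:
    """Toy model with one-bit entries meaning 'the hint was wrong last time'."""

    def __init__(self, entries: int = 16) -> None:
        self.table = [0] * entries           # 0 = hint assumed correct

    def _index(self, pc: int) -> int:
        return pc % len(self.table)          # assumed indexing scheme

    def predict(self, pc: int, hint: int) -> int:
        # Step a): combine the encoded hint with the table entry.
        return hint ^ self.table[self._index(pc)]

    def update(self, pc: int, hint: int, taken: int) -> None:
        # Step b): store 1 if the hint was wrong, 0 if it was correct.
        self.table[self._index(pc)] = hint ^ taken


def count_mispredictions(predictor: HintedPredictor, trace) -> int:
    """Run (pc, hint, taken) tuples through the predictor and count misses."""
    misses = 0
    for pc, hint, taken in trace:
        if predictor.predict(pc, hint) != taken:
            misses += 1
        predictor.update(pc, hint, taken)
    return misses


# Usage: one branch whose hint says "taken" but which is actually always taken
# except once; outcomes are taken, taken, not taken, taken.
trace = [(4, 1, 1), (4, 1, 1), (4, 1, 0), (4, 1, 1)]
misses = count_mispredictions(HintedPredictor(), trace)
```

In this trace the single not-taken outcome costs two mispredictions: one when it occurs, and one on the following execution, because the entry then inverts a hint that is correct again.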

8. A method (300) as in claim 7, wherein predicting
(302) the outcome of a given branch instruction
comprises predicting the outcome of the given
branch instruction at least partly in response to the
exclusive OR (216) of:

a) a hint (220) encoded in the given branch
instruction; and

b) an entry in the branch history table (214).

9. A method (300) as in claim 7, wherein updating
(304) an entry in the branch history table (214)
comprises updating an entry in the branch history
table at least partly in response to the exclusive OR
(212) of:

a) a hint (220) encoded in a given branch
instruction; and

b) an indication (222) as to whether execution
of the given branch instruction resulted in a
branch being taken or not taken.

10. Apparatus (200) in a microprocessor for predicting
whether branches identified in a plurality of branch
instructions will be taken or not taken, comprising:

a) a branch history table (214) comprising a 
plurality of entries; 

b) one or more data storage locations (204) for 
storing static prediction information corre- 
sponding to a plurality of branch instructions; 

c) a first logic gate (216), comprising: 

i) an input for receiving static prediction 
information (220) derived from an 
addressed one of the one or more data 
storage locations; 

ii) an input for receiving information (226) 
derived from at least one entry in the 
branch history table; and 

iii) a branch prediction output (224) which 
is indicative of whether one of the plurality 
of branch instructions will be taken or not 
taken; 

d) a second logic gate (212), comprising: 

i) an input for receiving static prediction 
information derived from an addressed 
one of the one or more data storage loca- 
tions; 

ii) an input for receiving information (222) 
which is indicative of whether a branch 
identified in a branch instruction was taken 
or not taken; and 

iii) a branch history update output (218)
which is indicative of whether the static
prediction information corresponding to a
branch instruction was correct;

wherein the branch history update output is
received by the branch history table.

11. Apparatus (200) as in claim 10, wherein the first
(216) and second (212) logic gates are exclusive
OR gates.
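The effect of the two exclusive OR gates on branches that share a table entry can be explored with a small simulation. Everything here is an illustrative assumption: one-bit entries, indexing by address modulo table size, and a baseline obtained by forcing the static input to 0 so that both gates pass their other input through unchanged. The function name `run` is hypothetical.

```python
def run(trace, use_hint: bool, entries: int = 16) -> int:
    """Count mispredictions over (pc, hint, taken) tuples.

    With use_hint=False the static input is tied to 0, which reduces the
    model to a plain one-bit table that stores the last outcome.
    """
    table = [0] * entries
    misses = 0
    for pc, hint, taken in trace:
        i = pc % entries                      # assumed index function
        static = hint if use_hint else 0
        prediction = static ^ table[i]        # first gate (216)
        misses += int(prediction != taken)
        table[i] = static ^ taken             # second gate (212)
    return misses


# Two branches that alias to entry 0: one always taken (hint 1),
# one never taken (hint 0), executed alternately four times each.
trace = [(0, 1, 1), (16, 0, 0)] * 4
with_hints = run(trace, use_hint=True)
without_hints = run(trace, use_hint=False)
```

In this toy trace both hints are always correct, so the shared entry stays at 0 and neither branch disturbs the other, whereas the plain table is overwritten by each branch in turn and mispredicts every time.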






EP 0 938 044 A2 



FIG. 1

Block diagram (100): instruction fetch unit (102),
instruction memory (104), instruction execution unit
(106), hold (108), and branch history table (110).






FIG. 2

Block diagram of apparatus (200): instruction fetch
unit (202), instruction memory (204), instruction
execution unit (206), hold (208), decode unit (210),
logic gates (212, 216), branch history table (214),
and hold (228), with signal lines 218, 220, 222, 224
and 226.






FIG. 3

Flowchart of method (300):

START

(302) AT LEAST PARTLY IN RESPONSE TO HINTS
ENCODED IN A BRANCH INSTRUCTION, AND
ENTRIES IN A BRANCH HISTORY TABLE,
PREDICT THE OUTCOME OF EACH OF A
PLURALITY OF BRANCH INSTRUCTIONS

(304) AFTER EXECUTING A GIVEN BRANCH
INSTRUCTION, UPDATE AN ENTRY IN THE
BRANCH HISTORY TABLE IN RESPONSE TO
WHETHER THE HINT ENCODED IN THE GIVEN
BRANCH INSTRUCTION WAS CORRECT

END






FIG. 4

Flowchart of method (400):

START

(402) MAINTAIN BRANCH HISTORY TABLE
COMPRISING A PLURALITY OF ENTRIES

(404) MAINTAIN STATIC PREDICTION INFORMATION
FOR A PLURALITY OF BRANCH INSTRUCTIONS

(406) AT LEAST PARTLY IN RESPONSE TO THE
STATIC PREDICTION INFORMATION, AND
ENTRIES IN THE BRANCH HISTORY TABLE,
PREDICT THE OUTCOME OF EACH OF THE
PLURALITY OF BRANCH INSTRUCTIONS

(408) AFTER EXECUTING EACH OF THE PLURALITY
OF BRANCH INSTRUCTIONS, UPDATE AN
ENTRY IN THE BRANCH HISTORY TABLE IN
RESPONSE TO WHETHER THE STATIC
PREDICTION INFORMATION WAS CORRECT

END






(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 0 938 044 A3

(12) EUROPEAN PATENT APPLICATION





Date of publication A3: 08.03.2000 Bulletin 2000/10

(51) Int. Cl.: G06F 9/38

Date of publication A2: [illegible]










(21) Application number: 98115101.2

(22) Date of filing: 11.08.1998

(84) Designated Contracting States:
AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU
MC NL PT SE
Designated Extension States:
AL LT LV MK RO SI

(30) Priority: 23.02.1998 US 28258

(71) Applicant:
Hewlett-Packard Company
Palo Alto, California 94304 (US)

(72) Inventor: Hunt, Douglas B.
Fort Collins, Colorado 80526 (US)

(74) Representative:
Schoppe, Fritz, Dipl.-Ing.
Schoppe, Zimmermann & Stöckeler
Patentanwälte
Postfach 71 08 67
81458 München (DE)








(54) Methods and apparatus for reducing interference in a branch history table of a 
microprocessor 
